Display apparatus and method for recognizing voice

ABSTRACT

A display apparatus which is capable of recognizing a voice and a method thereof are provided. The method includes receiving an uttered voice of a user, extracting a plurality of similar words which are similar to the uttered voice by extracting voice information from the uttered voice and measuring reliability of a plurality of words based on the extracted voice information, setting a word satisfying a predetermined condition from among the plurality of extracted similar words as a target word with respect to the uttered voice, and displaying at least one of the target word and a similar word list including similar words other than the target word. In this manner, a display apparatus may improve a recognition rate on an uttered voice of a user without changing an internal component related to voice recognition, such as an acoustic model, a pronunciation dictionary, or the like.

CROSS-REFERENCE TO RELATED APPLICATION

This application is based on and claims priority under 35 U.S.C. §119 to Korean Patent Application No. 10-2014-0112370, filed on Aug. 27, 2014 in the Korean Intellectual Property Office, the disclosure of which is incorporated by reference herein in its entirety.

BACKGROUND

1. Field

The disclosure generally relates to a display apparatus and a method thereof, and for example, to a display apparatus which is capable of recognizing an uttered voice of a user and a method thereof.

2. Description of Related Art

Generally, in response to a user's uttered voice being received, a voice-recognizable display apparatus compares the received uttered voice with a plurality of pre-registered words and sets a word having high reliability as an execution command with respect to the user's uttered voice.

However, in this case, when there are a plurality of similar words which are similar to the user's uttered voice, a similar word which does not meet an intention of the user may be set as an execution command with respect to the user's uttered voice.

In the related art for resolving such problem, a method for recognizing a voice includes assigning a critical value to each of a plurality of pre-registered words and setting a word having a reliability value higher than the critical values of the plurality of pre-registered words as an execution command. However, such method has a problem in which an execution command is set based on a user's uttered voice, and thus, when there are a plurality of similar words which are similar to the user's uttered voice, a particular similar word is set as the execution command.

Another method for recognizing a voice in the related art includes providing a plurality of similar words which are similar to a user's uttered voice in a form of a list and setting a similar word selected by the user as an execution command. However, when there are a plurality of similar words which are similar to the uttered voice, this method provides a list of the plurality of similar words, and thus, lacks a practical use in terms of convenience of controlling an operation of a display apparatus through a user's uttered voice.

SUMMARY

The disclosure has been provided to address the aforementioned and other problems and disadvantages occurring in the related art, and an aspect of the disclosure provides a display apparatus for enhancing a recognition rate on an uttered voice of a user.

According to an example embodiment, a method for performing voice recognition on an uttered voice of a user in a display apparatus is provided, the method including: receiving an uttered voice of a user, extracting a plurality of similar words which are similar to the uttered voice by extracting voice information from the uttered voice and measuring reliability on a plurality of words based on the extracted voice information, setting a word satisfying a predetermined condition from among the plurality of extracted similar words as a target word with respect to the uttered voice, and displaying at least one of the target word and a similar word list including similar words other than the target word.

The voice information may be pronunciation information on text converted through voice recognition on the uttered voice.

Extracting the plurality of similar words may include extracting a plurality of similar words which are similar to the uttered voice based on a reliability value calculated from similarity between a pronunciation defined for each of the plurality of words and a pronunciation with respect to the uttered voice. In addition, setting the word satisfying the predetermined condition may include comparing a reliability value determined for each of the plurality of similar words and a critical value assigned to each of the similar words and setting a similar word having a reliability value higher than the critical value assigned to each of the similar words as a target word with respect to the uttered voice.

The method may further include setting an execution command. In response to an execution command of the user not being received for a predetermined critical time or in response to a selection command with respect to the target word being received, setting the execution command may include setting the target word as an execution command, and in response to a selection command with respect to the similar word list being received, setting a similar word corresponding to the selection command as an execution command.

The similar word list may be a list including other similar words than the target word and different symbolic letters being matched with the other similar words. In response to the selection command being an uttered voice related to a symbolic letter, setting the execution command may include setting a similar word which is matched with a symbolic letter similar to the uttered voice from among the similar words in the similar word list as an execution command.

The method may further include adjusting a critical value assigned to the similar word set as the execution command from among the plurality of similar words including the target word.

In response to the target word being set as an execution command, adjusting the critical value may include decreasing a critical value of the similar word set as the target word by a predetermined adjustment value.

In response to a similar word included in the similar word list being set as an execution command, adjusting the critical value may include decreasing a critical value of the similar word set as the execution command by a first adjustment value and increasing a critical value of the similar word set as the target word by a second adjustment value.

In response to the plurality of similar words which are similar to the uttered voice being extracted, extracting the plurality of similar words may include grouping the plurality of extracted similar words into a similar word group.

In response to the similar words extracted in connection with the uttered voice being grouped into a similar word group, extracting the plurality of similar words may include extracting all words in the similar word group as a similar word related to the uttered voice.

According to an example embodiment, a display apparatus is provided including: an input circuit configured to receive an uttered voice of a user, a display configured to display a voice recognition result based on the uttered voice, a voice processor configured to extract a plurality of similar words which are similar to the uttered voice by extracting voice information from the uttered voice and measuring reliability on a plurality of words based on the extracted voice information, and a controller configured to set a word satisfying a predetermined condition from among the plurality of extracted similar words as a target word with respect to the uttered voice and to control the display to display at least one of the target word and a similar word list including similar words other than the target word.

The voice information may be pronunciation information on text converted through voice recognition on the uttered voice.

The voice processor may be configured to extract a plurality of similar words which are similar to the uttered voice based on a reliability value determined from similarity between a pronunciation defined for each of the plurality of words and a pronunciation with respect to the uttered voice. In addition, the controller may be configured to compare a reliability value determined for each of the plurality of similar words and a critical value assigned to each of the similar words and set a similar word having a reliability value higher than the critical value assigned to each of the similar words as a target word with respect to the uttered voice.

In response to an execution command of the user not being received for a predetermined critical time or in response to a selection command with respect to the target word being received, the controller may be configured to set the target word as an execution command, and in response to a selection command with respect to the similar word list being received, may set a similar word corresponding to the selection command as an execution command.

The similar word list may be a list including other similar words than the target word and different symbolic letters being matched with the other similar words. In response to the selection command being an uttered voice related to a symbolic letter, the controller may be configured to set a similar word which is matched with a symbolic letter similar to the uttered voice from among the similar words in the similar word list as an execution command.

The controller may be configured to adjust a critical value assigned to the similar word set as the execution command from among the plurality of similar words including the target word.

In response to the target word being set as an execution command, the controller may be configured to decrease a critical value of the similar word set as the target word by a predetermined adjustment value.

In response to a similar word included in the similar word list being set as an execution command, the controller may be configured to decrease a critical value of the similar word set as the execution command by a first adjustment value and may increase a critical value of the similar word set as the target word by a second adjustment value.

In response to the plurality of similar words which are similar to the uttered voice being extracted, the controller may be configured to group the plurality of extracted similar words into a similar word group.

In response to the similar words extracted in connection with the uttered voice being grouped into a similar word group, the voice processor may be configured to extract all words in the similar word group as a similar word related to the uttered voice.

According to the above-described various example embodiments, a display apparatus may improve a recognition rate on an uttered voice of a user without changing an internal component related to voice recognition, such as, for example, an acoustic model, a pronunciation dictionary, etc.

BRIEF DESCRIPTION OF THE DRAWINGS

The above and/or other aspects and advantages of the disclosure will become more apparent from the following detailed description taken in conjunction with the accompanying drawings, in which like reference numerals refer to like elements, and wherein:

FIG. 1 is a first demonstration diagram illustrating an interactive system which provides response information corresponding to a user's uttered voice according to an example embodiment;

FIG. 2 is a second demonstration diagram illustrating an interactive system which provides response information corresponding to a user's uttered voice according to an example embodiment;

FIG. 3 is a block diagram illustrating a display apparatus according to an example embodiment;

FIG. 4 is a demonstration diagram illustrating an operation of setting a target word in a display apparatus according to an example embodiment;

FIGS. 5A-5B are demonstration diagrams illustrating an operation of displaying a voice recognition result on a user's uttered voice in a display apparatus according to an example embodiment;

FIG. 6 is a demonstration diagram illustrating an operation of adjusting a critical value of a similar word which is similar to a user's uttered voice in a display apparatus according to an example embodiment;

FIG. 7 is a flowchart illustrating a method for recognizing a user's uttered voice in a display apparatus according to an example embodiment;

FIG. 8 is a flowchart illustrating a method for setting a target word in a display apparatus according to an example embodiment; and

FIG. 9 is a flowchart illustrating a method for adjusting a critical value of a similar word in a display apparatus according to an example embodiment.

DETAILED DESCRIPTION

The example embodiments of the disclosure may be diversely modified. Accordingly, example embodiments are illustrated in the drawings and are described in detail in the detailed description. However, it is to be understood that the disclosure is not limited to a specific example embodiment, but includes all modifications, equivalents, and substitutions without departing from the scope and spirit of the disclosure. Also, well-known functions or constructions are not described in detail since they might obscure the disclosure with unnecessary detail.

Certain example embodiments are described in detail below with reference to the accompanying drawings.

In the following description, like drawing reference numerals are used for the like elements, even in different drawings. The matters defined in the description, such as detailed construction and elements, are provided to assist in understanding of the example embodiments. However, example embodiments can be practiced without those specifically defined matters. Also, well-known functions or constructions are not described in detail since they might obscure the application with unnecessary detail.

FIG. 1 is a first demonstration diagram illustrating an interactive system which provides response information corresponding to a user's uttered voice according to an example embodiment, and FIG. 2 is a second demonstration diagram illustrating an interactive system which provides response information corresponding to a user's uttered voice according to an example embodiment.

As illustrated in FIG. 1, a display apparatus 100 having an interactive system may be an apparatus in which internet access is available and may be realized as diverse electronic apparatuses such as, for example, a smart television (TV), a mobile phone including a smart phone, a desktop personal computer (PC), a laptop PC, a navigation apparatus, etc. In response to a user's uttered voice being received, the display apparatus 100 is configured to perform an operation corresponding to the received uttered voice. For example, in response to the user's uttered voice being received, the display apparatus 100 may, for example, convert a voice signal regarding the received uttered voice into text. According to an example embodiment, the display apparatus 100 may, for example, convert a voice signal regarding a user's uttered voice into text by using a speech to text (STT) algorithm.

The display apparatus 100 compares a pronunciation of the text converted from the user's uttered voice with a pronunciation of each of a plurality of pre-registered words and measures reliability with respect to the plurality of words. Subsequently, the display apparatus 100 extracts a plurality of similar words which are similar to the user's uttered voice based on the measured reliability.

The display apparatus 100 sets a similar word which satisfies a predetermined condition from among the plurality of extracted similar words as a target word with respect to the user's uttered voice and displays the target word and a similar word list including the other similar words in a form of a user interface (UI).

For example, in response to three similar words being extracted in connection with the user's uttered voice, the display apparatus 100 compares a reliability value of each of the three similar words with a critical value assigned to each of the three similar words and sets a similar word having a reliability value higher than the critical value as a target word. In addition, the display apparatus 100 generates a similar word list regarding the two similar words other than the similar word set as the target word and displays the similar word list on a screen in a form of a UI.

In response to a selection command with respect to the target word being received or in response to a selection command not being received from a user for a predetermined time while the target word and the similar word list are displayed, the display apparatus 100 sets the target word as an execution command with respect to the user's uttered voice. Subsequently, the display apparatus 100 may control an operation of the display apparatus 100 or receive and display a content from a web server (not shown) based on the target word set as the execution command.

Meanwhile, in response to a selection command with respect to at least one similar word included in the similar word list being received, the display apparatus 100 sets a similar word corresponding to the received selection command as an execution command with respect to the user's uttered voice. Subsequently, the display apparatus 100 may control an operation of the display apparatus 100 or receive and display a content from a web server (not shown) based on the similar word set as the execution command.

As described above, according to an example embodiment, the display apparatus 100 may provide a plurality of similar words which are difficult to distinguish from the user's uttered voice and may execute control such that a similar word having a high frequency of use is executed by priority. Accordingly, the display apparatus 100 may reduce and/or minimize an error of a voice recognition result on an execution command that the user intended.

Meanwhile, in response to the execution command with respect to the user's uttered voice being set, the display apparatus 100 may adjust a critical value of the similar word set as the execution command. For example, in response to the similar word set as the target word or at least one of the similar words in the similar word list being set as the execution command, the display apparatus 100 decreases the critical value of the similar word set as the execution command by a predetermined value.

Accordingly, the display apparatus 100 may set a target word with respect to the user's uttered voice by using the adjusted critical value of each of the similar words in a voice recognition process, thereby improving a voice recognition rate on the user's uttered voice.
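
As a non-limiting illustration of the critical value adjustment described above, the following minimal Python sketch lowers the critical value of whichever similar word is set as the execution command. The increase applied to a target word that was passed over in favor of a word from the similar word list follows the Summary. The function name `adjust_critical_values`, the adjustment sizes, and the starting values are hypothetical and are not taken from the disclosure.

```python
def adjust_critical_values(critical, target, executed,
                           first_adjustment=2.0, second_adjustment=1.0):
    """Return updated critical values after one recognition round.

    critical: dict mapping each similar word to its critical value.
    target:   the similar word that was set as the target word.
    executed: the similar word that was finally set as the execution command.
    The adjustment sizes are assumed values chosen only for illustration.
    """
    updated = dict(critical)  # leave the caller's mapping untouched
    # The word set as the execution command becomes easier to select next time.
    updated[executed] -= first_adjustment
    if executed != target:
        # The target word was passed over, so it becomes harder to select.
        updated[target] += second_adjustment
    return updated


critical = {"fitness": 90.0, "business": 90.0}
confirmed = adjust_critical_values(critical, target="fitness", executed="fitness")
corrected = adjust_critical_values(critical, target="fitness", executed="business")
```

In this sketch, confirming the target word only lowers its own critical value, whereas choosing a word from the similar word list moves both values in opposite directions.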

As illustrated in FIG. 2, an interactive system may, for example, include the display apparatus 100 and a voice recognition apparatus 200. In this case, the display apparatus 100 receives an input of the user's uttered voice and transmits a voice signal with respect to the received uttered voice to the voice recognition apparatus 200. The voice recognition apparatus 200 converts the voice signal with respect to the user's uttered voice received from the display apparatus 100 into, for example, text and compares a pronunciation of the converted text with a pronunciation of each of the plurality of pre-registered words to measure the reliability of the plurality of words. Subsequently, the voice recognition apparatus 200 extracts a plurality of similar words which are similar to the user's uttered voice based on the measured reliability. The voice recognition apparatus 200 sets a similar word which satisfies a predetermined condition from among the plurality of extracted similar words as the target word with respect to the user's uttered voice and transmits voice recognition result information including the target word and the other similar words to the display apparatus 100.

The display apparatus 100 displays the target word and a similar word list including the other similar words in a form of a UI by using the voice recognition result information received from the voice recognition apparatus 200. In response to a selection command with respect to one of the similar word set as the target word and the similar words in the similar word list being received, the display apparatus 100 may be configured to control the operation of the display apparatus 100 or receive and display a content from a web server (not shown) based on the received selection command.

In response to the execution command with respect to the user's uttered voice being set, the display apparatus 100 transmits execution information on the similar word set as the execution command to the voice recognition apparatus 200. The voice recognition apparatus 200 may, for example, decrease the critical value of the similar word set as the execution command by a predetermined value based on the received execution information.

The voice recognition apparatus 200 may set the target word with respect to the user's uttered voice using the adjusted critical value of each of the similar words in the voice recognition process, thereby improving the voice recognition rate on the user's uttered voice.

As above, the interactive system according to an example embodiment has been described schematically.

Components of the display apparatus 100 according to an example embodiment will be described in detail below.

FIG. 3 is a block diagram illustrating a display apparatus according to an example embodiment.

Referring to FIG. 3, the display apparatus 100 includes an input unit 110, a communicator 120, a voice processor 130, a controller 140, a storage 150, and a display 160.

The input unit 110 in the form of input circuitry, such as, for example, a microphone, receives an input of a user's uttered voice. For example, in response to a user's uttered voice in an analogue form being received through, for example, a microphone, the input unit 110 samples the received uttered voice and converts the uttered voice into a digital signal. In this case, when the received uttered voice includes noise due to a factor in a surrounding environment, it is preferred to remove the noise and then convert the user's uttered voice into a digital signal. In addition, the input unit 110 may receive diverse user manipulations and transmit the received user manipulations to the controller 140. In this case, the input unit 110 may receive a user manipulation command through a touch pad, a key pad including various function keys, number keys, special keys, letter keys, etc., a touch screen, or the like.

The communicator 120, such as, for example, communication circuitry, performs data communication with a remote controller (not shown) which controls the display apparatus 100 or a web server (not shown). For example, the communicator 120 may receive a control signal for controlling the display apparatus 100 or a voice signal regarding a user's uttered voice input in the remote controller (not shown) from the remote controller (not shown). In addition, the communicator 120 may receive content that the user requested by performing data communication with the web server (not shown). In addition, as illustrated in connection with FIG. 2, in response to the voice recognition on the user's uttered voice being performed through the voice recognition apparatus 200, the communicator 120 may transmit the voice signal regarding the user's uttered voice, input through the input unit 110 or received through the remote controller (not shown), to the voice recognition apparatus 200 and receive a voice recognition result on the user's uttered voice from the voice recognition apparatus 200.

The communicator 120 may include various communication modules or circuitry, such as a local area wireless communication module (not shown), a wireless communication module (not shown), etc. In this case, the local area wireless communication module (not shown) may, for example, be a communication module which performs wireless communication with the voice recognition apparatus 200 located in a close range and an external server (not shown) which provides a content. For example, the local area wireless communication module (not shown) may be Bluetooth, Zigbee, etc. The wireless communication module (not shown) may, for example, be a module which is connected to an external network and performs communication according to a wireless communication protocol such as Wireless-Fidelity (Wi-Fi), Institute of Electrical and Electronics Engineers (IEEE), etc. In addition, the wireless communication module (not shown) may further include a mobile communication module which accesses a mobile communication network and performs communication according to diverse mobile communication standards such as 3rd Generation (3G), 3rd Generation Partnership Project (3GPP), Long Term Evolution (LTE), etc.

The voice processor 130 is configured to perform voice recognition on a user's uttered voice input through the input unit 110 or received from the remote controller (not shown) through the communicator 120 and to extract voice information. The voice processor 130 is configured to measure the reliability with respect to a plurality of pre-registered words in the storage 150 based on the extracted voice information and to extract a plurality of similar words which are similar to the user's uttered voice. In this case, the voice information may, for example, be pronunciation information of the text converted through the voice recognition on the user's uttered voice.

According to an example embodiment, the voice processor 130 may be configured to convert the user's uttered voice into text by using an STT algorithm. For example, in response to an uttered voice “Volume up!” being received, the input unit 110 converts the uttered voice “Volume up!” into a voice signal in, for example, a digital form. In response to the uttered voice being converted into the voice signal, the voice processor 130 converts the voice signal regarding the uttered voice “Volume up!” into text.

Subsequently, the voice processor 130 extracts a pronunciation from the text on the user's uttered voice. In response to the pronunciation being extracted from the text on the user's uttered voice, the voice processor 130 measures the similarity between a pronunciation defined for each of the plurality of pre-registered words and the pronunciation with respect to the user's uttered voice and determines, for example, by calculating, a reliability value according to the measured similarity. According to an example embodiment, the voice processor 130 may measure the similarity between the pronunciation with respect to the user's uttered voice and the pronunciation defined for each of the plurality of pre-registered words and determine the reliability value by using a similarity algorithm such as, for example, a Confusion Matrix.
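
The reliability measurement described above may be sketched, purely for illustration, as follows. The sketch does not implement a Confusion Matrix; it substitutes the generic sequence matcher from the Python standard library (`difflib.SequenceMatcher`) as a stand-in similarity measure. The pronunciation dictionary, its phoneme symbols, and the 0 to 100 scale are assumptions made for this example only.

```python
from difflib import SequenceMatcher

# Hypothetical pronunciation dictionary: word -> space-separated phoneme symbols.
PRONUNCIATIONS = {
    "fitness": "f ih t n ah s",
    "business": "b ih z n ah s",
    "volume": "v aa l y uw m",
}


def reliability(uttered_pron, word_pron):
    """Map phoneme-sequence similarity onto an assumed 0-100 reliability value."""
    matcher = SequenceMatcher(None, uttered_pron.split(), word_pron.split())
    return round(100 * matcher.ratio())


def measure_all(uttered_pron):
    """Measure reliability of every pre-registered word against the utterance."""
    return {word: reliability(uttered_pron, pron)
            for word, pron in PRONUNCIATIONS.items()}


# Pronunciation assumed to have been extracted from the text of the utterance.
scores = measure_all("f ih t n ah s")
```

A word whose pronunciation matches the utterance exactly receives the maximum value, and words sharing fewer phonemes receive lower values. A confusion-matrix approach would additionally weight substitutions between acoustically close phonemes, which this stand-in does not do.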

In response to the reliability value for each of the plurality of words being determined, the voice processor 130 may be configured to extract a plurality of similar words which are similar to the user's uttered voice based on the determined reliability value. According to an example embodiment, in response to the reliability value for each of the plurality of words being determined, the voice processor 130 may be configured to extract words in a predetermined order, starting from a word having the highest reliability value, as a plurality of similar words which are similar to the user's uttered voice. According to another example embodiment, in response to the reliability values of the plurality of words being determined, the voice processor 130 may be configured to extract words having a reliability value higher than a predetermined reference value from among the plurality of words as a plurality of similar words which are similar to the user's uttered voice.
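
The two extraction strategies described above may be sketched as follows. Reading "a predetermined order" as a fixed number of top-ranked words is an assumption of this example, as are the function names, the reliability values, and the word "witness", none of which appear in the disclosure.

```python
def extract_top_ranked(scores, count):
    """Extract the `count` words with the highest reliability values."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:count]


def extract_above_reference(scores, reference):
    """Extract every word whose reliability value exceeds the reference value."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    return [word for word in ranked if scores[word] > reference]


# Assumed reliability values for four pre-registered words.
scores = {"volume": 10, "business": 80, "fitness": 100, "witness": 75}

by_rank = extract_top_ranked(scores, count=2)
by_reference = extract_above_reference(scores, reference=70)
```

The rank-based variant always returns the same number of candidates, while the reference-value variant returns a number of candidates that depends on how many words resemble the utterance.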

In addition, in response to the similar words which are similar to the user's uttered voice being extracted based on the reliability value of each of the plurality of words, the voice processor 130 may determine whether the extracted similar words are grouped into a similar word group. In response to determining that the extracted similar words are grouped into a particular similar word group, the voice processor 130 may extract the other words in the similar word group as a similar word.

The controller 140 may be configured to control overall operations of the components of the display apparatus 100. For example, in response to a plurality of similar words which are similar to the user's uttered voice being extracted through the voice processor 130, the controller 140 may be configured to determine whether the plurality of extracted similar words are pre-stored as a similar word group. In response to determining that the plurality of extracted similar words are not grouped into the same similar word group, the controller 140 may be configured to group the plurality of similar words extracted in connection with the user's uttered voice into the same similar word group and store the similar word group in the storage 150. As described above, in response to the similar words which are similar to the user's uttered voice being extracted, the voice processor 130 may extract the other words which are grouped into the same similar word group as the extracted similar words as a similar word.
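
The grouping behavior described above may be sketched as follows, with a plain Python list standing in for the storage 150. The function names and the rule used to decide whether a group is already stored (the extracted words being a subset of an existing group) are assumptions of this example.

```python
def store_group(extracted, groups):
    """Store the extracted words as one similar word group unless already covered."""
    new_group = frozenset(extracted)
    if not any(new_group <= group for group in groups):
        groups.append(new_group)


def expand_with_groups(extracted, groups):
    """Add the other members of any stored group that an extracted word belongs to."""
    result = list(extracted)
    for group in groups:
        if group.intersection(extracted):
            for word in sorted(group):
                if word not in result:
                    result.append(word)
    return result


groups = []  # stands in for the similar word groups kept in storage
store_group(["fitness", "business"], groups)
store_group(["fitness", "business"], groups)  # already stored, so no duplicate

# A later utterance for which only "business" was extracted by reliability.
expanded = expand_with_groups(["business"], groups)
```

With the group stored, a later utterance that yields only one member of the group still brings the other member back as a similar word.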

Meanwhile, the controller 140 may be configured to set a similar word which satisfies a predetermined condition from among the plurality of similar words extracted in connection with the user's uttered voice as a target word. Subsequently, the controller 140 is configured to control the display 160 to display at least one of the similar word set as the target word and a similar word list including the similar words other than the similar word set as the target word. The display 160 which displays a voice recognition result on the user's uttered voice may display the similar word set as the target word and the similar word list on a screen in a UI form according to a control command of the controller 140.

For example, in response to the plurality of similar words which are similar to the user's uttered voice being extracted, the controller 140 may be configured to compare a reliability value of each of the plurality of extracted similar words with a critical value assigned to each of the plurality of similar words and set a similar word having a reliability value higher than the critical value from among the plurality of extracted similar words as a target word which is the most similar to the user's uttered voice.

For example, in connection with an uttered voice “fitness,” a first similar word “fitness” which is the same as the uttered voice and a second similar word “business” may be extracted from a plurality of pre-registered words. The pronunciation of the uttered voice “fitness” and the pronunciation of the first similar word may be ‘[#p{i.t{u-.ni.su#],’ and the pronunciation of the second similar word “business” may be ‘[#pi.j-u-.ni.s′u-#].’ In addition, the reliability value of the first similar word regarding the uttered voice “fitness” may be 100, the reliability value of the second similar word “business” may be 80, and the critical value assigned to the first and second similar words may be set to be 90. In this case, the controller 140 may be configured to determine that the reliability value of the first similar word out of the first similar word and the second similar word which are similar to the uttered voice “fitness” is higher than the critical value assigned to the first similar word and set the first similar word as a target word with respect to the user's uttered voice.
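
The "fitness" and "business" example above may be sketched as follows, using the reliability values (100 and 80) and the critical value (90) given in the text. The function name, the tie-breaking rule applied when several words exceed their critical values, and the handling of the case in which no word does are assumptions of this example.

```python
def set_target_word(reliability, critical):
    """Return (target_word, other_similar_words) for one utterance.

    A word qualifies when its reliability value is higher than its own
    critical value. If several words qualify, the one with the highest
    reliability value is chosen (an assumed rule). If none qualifies,
    the target word is None (also an assumption).
    """
    qualifying = [w for w in reliability if reliability[w] > critical[w]]
    if not qualifying:
        return None, list(reliability)
    target = max(qualifying, key=reliability.get)
    others = [w for w in reliability if w != target]
    return target, others


reliability = {"fitness": 100, "business": 80}
critical = {"fitness": 90, "business": 90}

target, similar_word_list = set_target_word(reliability, critical)
```

Here `target` is "fitness" and `similar_word_list` holds only "business", matching the outcome described in the paragraph above.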

As described above, in response to the target word with respect to the user's uttered voice being set, the display 160 may display the similar word set as the target word and a similar word list including the other similar words in a form of a UI. In this case, the similar word list is a list including the similar words other than the similar word set as the target word from among the plurality of similar words extracted in connection with the user's uttered voice and different symbolic letters being matched with the other similar words.

While the target word and the similar word list are displayed, the controller 140 may be configured to set the target word or at least one similar word included in the similar word list as an execution command based on a selection command of the user.

According to an example embodiment, in response to an execution command of the user not being input through the input unit 110 for a predetermined critical time or in response to a selection command with respect to a target word being received, the controller 140 may be configured to set the target word as an execution command. Meanwhile, in response to the selection command input from the user being a selection command with respect to the similar word list, the controller 140 may set a similar word corresponding to the selection command inputted from the user from among a plurality of similar words included in the similar word list as an execution command. In this case, the selection command may be a user manipulation command through a touch pad, a keypad, a touch screen or the like, or may be a user's uttered voice.

Meanwhile, as described above, the similar word list may include at least one similar word which is similar to the user's uttered voice and a symbolic letter being matched with the similar word. To select a similar word included in the similar word list through an uttered voice, the user is able to utter the symbolic letter matched with the similar word that the user wishes to select. In response to the selection command input through the input unit 110 being an uttered voice regarding a symbolic letter, the controller 140 may, for example, be configured to set a similar word matched with a symbolic letter similar to the user's uttered voice from among the similar words in the similar word list as the execution command. Accordingly, the display apparatus 100 may reduce and/or minimize a recognition error with respect to the selection command regarding the similar word included in the similar word list.

As described above, in response to the target word or a similar word included in the similar word list being set as an execution command, the controller 140 may be configured to perform a control operation, such as channel change, volume control, etc., or receive and display a content from a web server (not shown) based on the execution command.

In response to the execution command being set from among the plurality of similar words including the target word, the controller 140 may be configured to adjust a critical value assigned to the similar word set as the execution command. According to an example embodiment, in response to the target word being set as the execution command, the controller 140 may be configured to decrease the critical value of the similar word set as the target word by a predetermined adjustment value. Meanwhile, in response to the similar word included in the similar word list being set as the execution command, the controller 140 may be configured to decrease the critical value of the similar word set as the execution command by a first predetermined adjustment value and increase the critical value of the similar word set as the target word by a second predetermined adjustment value.

As described above, according to an example embodiment, the display apparatus 100 may set the target word with respect to the user's uttered voice using the adjusted critical value of each of the similar words in the voice recognition process, thereby improving the voice recognition rate on the user's uttered voice.

FIG. 4 is a demonstration diagram illustrating an operation of determining a target word in a display apparatus according to an example embodiment.

As illustrated in FIG. 4, in response to an uttered voice “Show me MDC!” being received from a user, for example, three similar words “NDC,” “MDC,” and “ADC” which are similar to the received uttered voice may be extracted, and a reliability value of each of the similar words “NDC,” “MDC,” and “ADC” may be determined.

In response to the three similar words which are similar to the user's uttered voice and the reliability values of the similar words being obtained, the controller 140 may be configured to compare the reliability value of each of the similar words with the critical value assigned to each of the similar words and set a similar word having a reliability value higher than the critical value as a target word with respect to the user's uttered voice.

As illustrated in FIG. 4, the reliability value of the first similar word “NDC” is 4000, and the critical value of the first similar word may be set to 4200. The reliability value of the second similar word “MDC” is 3800, and the critical value of the second similar word may be set to 3600. In addition, the reliability value of the third similar word “ADC” is 3200, and the critical value of the third similar word may be set to 4000. The controller 140 may be configured to compare the reliability values of the first to third similar words with the critical values of the first to third similar words. In response to determining that the reliability value of the second similar word “MDC” 410 is higher than the critical value assigned to the second similar word, whereas the reliability values of the first and third similar words are lower than their respective critical values, the controller 140 may be configured to set the second similar word “MDC” 410 as the target word with respect to the user's uttered voice.

In response to the target word with respect to the user's uttered voice being set, the display 160 displays the second similar word 410 set as the target word and the similar word list including the first similar word and the third similar word in a screen.

FIGS. 5A-5B are demonstration diagrams illustrating an operation of displaying a voice recognition result on a user's uttered voice in a display apparatus according to an example embodiment.

As described above in connection with FIG. 4, in response to the second similar word “MDC” being set as the target word, the display apparatus 100 displays a target word 510 and a similar word list 520 in a screen as illustrated in FIG. 5A. The display apparatus 100 displays the second similar word “MDC”, that is, the target word 510, in an upper part of the screen and displays the similar word list 520 including the first similar word “NDC” and the third similar word “ADC” in a certain area in a lower part of the screen. In this case, the display apparatus 100 may display the similar word list 520 in which the first similar word “NDC” and the third similar word “ADC” are matched with the symbolic letters, for example, “1” and “2,” respectively.

In response to the execution command not being input from the user for a predetermined critical time or in response to the selection command with respect to the target word 510 being received while the target word 510 and the similar word list 520 are displayed, the display apparatus 100 performs a channel change from a currently tuned channel to an “MDC” channel based on the second similar word set as the target word 510.

Meanwhile, the user may intend to change the channel to “NDC,” not “MDC.” In this case, the display apparatus 100 may receive a selection command with respect to one of the first and third similar words included in the similar word list 520. In this case, the selection command may be a user manipulation command or an uttered voice. In the example embodiment, it is assumed that the selection command is an uttered voice. In response to the selection command regarding an uttered voice with respect to one of the first and third similar words included in the similar word list 520 being received, the display apparatus 100 may recognize the received uttered voice and determine whether the user intended to select the first similar word or the third similar word included in the similar word list 520.

As described above, the first similar word “NDC” and the third similar word “ADC” included in the similar word list 520 may be matched with the symbolic letters “1” and “2,” respectively. The user is able to utter one of the symbolic letters matched with the first and third similar words in order to change the channel to a channel corresponding to one of the first and third similar words. For example, in order to change the channel to an “NDC” channel corresponding to the first similar word, the user is able to utter the symbolic letter “1” matched with the first similar word. In response to the user's uttered voice based on the utterance being received, the display apparatus 100 may recognize the received uttered voice and determine that the user's uttered voice is a selection command with respect to the symbolic letter “1.” Subsequently, as illustrated in FIG. 5B, the display apparatus 100 displays a voice recognition result “1” 530 in an area in which the second similar word “MDC”, that is, the target word 510 is displayed. In addition, the display apparatus 100 changes the current channel to the “NDC” channel based on the first similar word matched with the symbolic letter “1.”
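By way of a non-limiting illustration, the matching of symbolic letters with the similar words other than the target word may be sketched as follows. The function names `build_similar_word_list` and `lookup_symbolic_letter`, and the choice of consecutive numerals as symbolic letters, are assumptions made for this sketch only and are not taken from the disclosure.

```python
def build_similar_word_list(similar_words, target_word):
    """Match each similar word other than the target word with a symbolic letter.

    Hypothetical helper: consecutive numerals "1", "2", ... are assumed as the
    symbolic letters, following the "1"/"2" example of FIG. 5A.
    """
    others = [word for word in similar_words if word != target_word]
    return {str(index): word for index, word in enumerate(others, start=1)}


def lookup_symbolic_letter(recognized_text, similar_word_list):
    """Return the similar word matched with the recognized symbolic letter.

    Returns None when the recognized text matches no symbolic letter
    (an assumption; the disclosure does not describe this case).
    """
    return similar_word_list.get(recognized_text.strip())


similar_word_list = build_similar_word_list(["NDC", "MDC", "ADC"], "MDC")
selected = lookup_symbolic_letter("1", similar_word_list)
```

In this sketch, uttering “1” while “MDC” is the target word resolves to “NDC,” so that the utterance to be recognized is a short symbolic letter rather than one of several mutually similar words.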

As described above, according to an example embodiment, the display apparatus 100 may receive a user's uttered voice with respect to the symbolic letters which are respectively matched with the plurality of similar words included in the similar word list 520, thereby improving the recognition rate on the similar word that the user intended from among the plurality of similar words included in the similar word list 520.

FIG. 6 is a demonstration diagram illustrating an operation of adjusting a critical value of a similar word which is similar to a user's uttered voice in a display apparatus according to an example embodiment.

As described above in connection with FIGS. 5A-5B, the user is able to select the first similar word “NDC” from among the first and third similar words included in the similar word list 520. In this case, the display apparatus 100 may, for example, decrease a critical value 610 of the first similar word “NDC” from 4300 to 4000 and increase a critical value 620 of the second similar word “MDC” set as the target word 510 from 3600 to 3800.

However, the example embodiment is not limited thereto. In response to the second similar word “MDC” set as the target word 510 being selected, the display apparatus 100 may decrease the critical value 620, that is, 3600, of the second similar word “MDC” set as the target word 510, by a predetermined adjustment value.

As described above, the display apparatus 100 may decrease or increase the critical value assigned to each of the plurality of similar words recognized from the user's uttered voice based on the selection command of the user. Accordingly, in response to a user's uttered voice which is similar to or the same as the previous uttered voice being subsequently received, the display apparatus 100 may set a target word by using the adjusted critical value of each of the plurality of similar words extracted in connection with the user's uttered voice, thereby improving the recognition rate on the user's uttered voice.
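The adjustment of the critical values may be sketched, for illustration only, as shown below. The function name `adjust_critical_values` and its parameters are hypothetical; the adjustment values of 300 and 200 are merely the differences implied by the example values of FIG. 6 described above and are not stated as such in the disclosure.

```python
def adjust_critical_values(critical_values, target_word, execution_word,
                           first_adjustment, second_adjustment):
    """Return a copy of the critical values adjusted after an execution command is set.

    The critical value of the word set as the execution command is decreased by
    first_adjustment. When that word came from the similar word list (that is,
    it is not the target word), the critical value of the target word is also
    increased by second_adjustment.
    """
    adjusted = dict(critical_values)
    adjusted[execution_word] -= first_adjustment
    if execution_word != target_word:
        adjusted[target_word] += second_adjustment
    return adjusted


# Example values from the description of FIG. 6: the user selects "NDC"
# from the similar word list while "MDC" is the target word.
before = {"NDC": 4300, "MDC": 3600, "ADC": 4000}
after = adjust_critical_values(before, target_word="MDC", execution_word="NDC",
                               first_adjustment=300, second_adjustment=200)
```

Lowering the critical value of the word the user actually chose, and raising that of the word that was wrongly proposed, makes the chosen word more likely to be set as the target word the next time a similar utterance is received.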

As above, the operation of performing the voice recognition with respect to a user's uttered voice in the display apparatus 100 has been described. A method for recognizing a user's uttered voice in the display apparatus 100 will be described in detail below.

FIG. 7 is a flowchart illustrating a method for recognizing a user's uttered voice in a display apparatus according to an example embodiment.

As illustrated in FIG. 7, in response to a user's uttered voice being received through a remote controller (not shown) or a microphone of the display apparatus 100, the display apparatus 100 extracts voice information from the received uttered voice and measures the reliability with respect to a plurality of pre-registered words based on the extracted voice information (S710, S720). Subsequently, the display apparatus 100 extracts a plurality of similar words which are similar to the user's uttered voice based on the measured reliability (S730).

For example, the display apparatus 100 may convert a voice signal regarding the received uttered voice into a voice signal in a digital form and convert the voice signal in the digital form into text by using the STT algorithm. Subsequently, the display apparatus 100 may extract a pronunciation from the text regarding the user's uttered voice, measure the similarity between the extracted pronunciation and a pronunciation of each of the plurality of pre-registered words, and determine a reliability value based on the similarity.

According to an example embodiment, the display apparatus 100 may measure the similarity between the pronunciation with respect to the user's uttered voice and the pronunciation of each of the plurality of pre-registered words and determine the reliability value by using a similarity algorithm such as, for example, a Confusion Matrix.
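The disclosure names a Confusion Matrix only as an example of a similarity algorithm and does not specify how it is applied. One possible reading, offered here purely as an assumption, is a weighted edit distance in which substituting two easily confused pronunciation units costs less than substituting two unrelated units. The function names, the `confusion` table, its cost of 0.2, and the 0-to-100 scale in the sketch below are all hypothetical.

```python
def confusion_distance(first, second, confusion):
    """Weighted edit distance between two pronunciation unit sequences.

    `confusion` maps a frozenset of two units to a substitution cost below 1.0
    for units assumed to be easily confused; all other substitutions,
    insertions, and deletions cost 1.0. (Hypothetical scheme.)
    """
    rows, cols = len(first) + 1, len(second) + 1
    dist = [[0.0] * cols for _ in range(rows)]
    for i in range(1, rows):
        dist[i][0] = float(i)
    for j in range(1, cols):
        dist[0][j] = float(j)
    for i in range(1, rows):
        for j in range(1, cols):
            if first[i - 1] == second[j - 1]:
                substitution = 0.0
            else:
                pair = frozenset((first[i - 1], second[j - 1]))
                substitution = confusion.get(pair, 1.0)
            dist[i][j] = min(dist[i - 1][j] + 1.0,          # deletion
                             dist[i][j - 1] + 1.0,          # insertion
                             dist[i - 1][j - 1] + substitution)
    return dist[-1][-1]


def reliability_value(first, second, confusion, scale=100):
    """Map the distance onto a reliability value; identical sequences score `scale`."""
    longest = max(len(first), len(second)) or 1
    return scale * (1.0 - confusion_distance(first, second, confusion) / longest)


confusion = {frozenset(("m", "n")): 0.2}   # assumed: "m" and "n" are easily confused
uttered = ["m", "d", "c"]
score_mdc = reliability_value(uttered, ["m", "d", "c"], confusion)
score_ndc = reliability_value(uttered, ["n", "d", "c"], confusion)
score_adc = reliability_value(uttered, ["a", "d", "c"], confusion)
```

Under these assumptions, a pre-registered word whose pronunciation differs only by a confusable unit receives a higher reliability value than a word that differs by an unrelated unit, which is the ordering the subsequent extraction of similar words relies on.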

In response to the reliability value of each of the plurality of words being determined, the display apparatus 100 may extract a plurality of similar words which are similar to the user's uttered voice based on the determined reliability value. In this case, when the plurality of similar words which are similar to the user's uttered voice are extracted, it is preferred to group the plurality of extracted similar words into a similar word group. The display apparatus 100 determines whether the plurality of the extracted similar words are grouped into a similar word group. In response to determining that at least one similar word among the plurality of extracted similar words is grouped into a similar word group, the display apparatus 100 may extract the other words in the similar word group as a similar word related to the user's uttered voice.
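The use of a similar word group may be illustrated with the following minimal sketch. The function name `expand_with_similar_word_groups` and the representation of groups as lists of words are assumptions made for illustration; the disclosure does not specify how the groups are stored.

```python
def expand_with_similar_word_groups(extracted_words, similar_word_groups):
    """Add the other members of any group that contains an extracted similar word.

    `similar_word_groups` is a hypothetical list of previously stored groups.
    The order of the originally extracted words is preserved.
    """
    expanded = list(extracted_words)
    for group in similar_word_groups:
        if any(word in group for word in extracted_words):
            for word in group:
                if word not in expanded:
                    expanded.append(word)
    return expanded


groups = [["NDC", "MDC", "ADC"], ["fitness", "business"]]
expanded = expand_with_similar_word_groups(["NDC"], groups)
```

In this sketch, extracting only “NDC” also yields “MDC” and “ADC” because the three words were previously grouped together, while the unrelated group is left untouched.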

In response to the plurality of similar words being extracted, the display apparatus 100 sets a similar word which satisfies a predetermined condition from among the plurality of extracted similar words as a target word (S740). Subsequently, the display apparatus 100 displays the similar word set as the target word and a similar word list including the other similar words (S750).

While the target word and the similar word list are displayed, the display apparatus 100 sets the similar word set as the target word or at least one similar word included in the similar word list as an execution command according to a selection command of a user (S760). According to an example embodiment, in response to an execution command of the user not being received for a predetermined critical time or in response to a selection command with respect to the target word being received, the display apparatus 100 may set the similar word set as the target word as an execution command. In addition, in response to a selection command with respect to the similar word list being received, the display apparatus 100 may set a similar word corresponding to the selection command from among the plurality of similar words included in the similar word list as an execution command.
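The decision made at S760 may be sketched as follows. The function name `set_execution_command` is hypothetical, a `selection` of `None` is assumed to represent the lapse of the predetermined critical time without any user input, and returning `None` for an unrecognized selection is likewise an assumption, since the disclosure does not describe that case.

```python
def set_execution_command(target_word, similar_word_list, selection=None):
    """Decide which word becomes the execution command.

    target_word:       the word currently set as the target word
    similar_word_list: mapping of symbolic letter -> similar word (assumed form)
    selection:         the user's selection, or None when the critical time
                       elapsed without input (assumed convention)
    """
    # No input within the critical time, or the target word itself was selected.
    if selection is None or selection == target_word:
        return target_word
    # A selection with respect to the similar word list.
    if selection in similar_word_list:
        return similar_word_list[selection]
    # Unrecognized selection: not covered by the disclosure (assumption).
    return None


word_list = {"1": "NDC", "2": "ADC"}
on_timeout = set_execution_command("MDC", word_list)
on_list_selection = set_execution_command("MDC", word_list, "1")
```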

The similar word list is a list including similar words other than the target word and different symbolic letters being matched with the other similar words. In response to a selection command of the user being an uttered voice related to a symbolic letter, the display apparatus 100 may set a similar word which is matched with the symbolic letter similar to the uttered voice from among the similar words in the similar word list as an execution command.

In response to the execution command being set, the display apparatus 100 adjusts a critical value assigned to the similar word set as the execution command from among the plurality of similar words including the target word (S770). That is, according to an example embodiment, the display apparatus 100 may set a target word with respect to the user's uttered voice by using the adjusted critical value of each of the similar words in a voice recognition process, thereby improving a voice recognition rate on the user's uttered voice.

A method for setting a target word from a plurality of similar words extracted in connection with a user's uttered voice in the display apparatus 100 will be described in detail below.

FIG. 8 is a flowchart illustrating a method for setting a target word in a display apparatus according to an example embodiment.

As illustrated in FIG. 8, in response to a plurality of similar words which are similar to a user's uttered voice being extracted, the display apparatus 100 selects a first similar word having the highest reliability value from among the plurality of extracted similar words (S810). Subsequently, the display apparatus 100 compares the reliability value of the first similar word with a critical value assigned to the first similar word (S820). In response to determining that the reliability value of the first similar word is higher than the critical value assigned to the first similar word, the display apparatus 100 sets the first similar word as a target word which is the most similar to the user's uttered voice and includes the other similar words in a similar word list (S830). Meanwhile, in response to the reliability value of the first similar word being lower than the critical value assigned to the first similar word, the display apparatus 100 selects a second similar word having the second highest reliability value and compares the reliability value of the second similar word with a critical value assigned to the second similar word in the same manner as in the preceding operations (S840). In response to the reliability value of the second similar word being higher than the critical value assigned to the second similar word, the display apparatus 100 sets the second similar word as the target word. According to the operations, the display apparatus 100 may compare the reliability value of each of the plurality of similar words with the critical value assigned to that similar word, starting from the similar word having the highest reliability value among the plurality of similar words extracted in connection with the user's uttered voice, and set a similar word having a reliability value higher than its critical value as the target word.
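The operations of FIG. 8 may be sketched, under stated assumptions, as a loop over the similar words in descending order of reliability value. The function name `select_target_word` is hypothetical, and returning `None` as the target word when no similar word exceeds its critical value is an assumption, since the disclosure does not describe that case.

```python
def select_target_word(reliability_values, critical_values):
    """Return (target_word, similar_word_list) following the order of FIG. 8.

    Similar words are examined from the highest reliability value downward.
    The first word whose reliability value is higher than its own critical
    value becomes the target word; the remaining words form the list.
    """
    ranked = sorted(reliability_values, key=reliability_values.get, reverse=True)
    for word in ranked:
        if reliability_values[word] > critical_values[word]:
            others = [other for other in ranked if other != word]
            return word, others
    # No word exceeded its critical value (case not covered by the disclosure).
    return None, ranked


# Example values from the description of FIG. 4.
reliability_values = {"NDC": 4000, "MDC": 3800, "ADC": 3200}
critical_values = {"NDC": 4200, "MDC": 3600, "ADC": 4000}
target_word, similar_word_list = select_target_word(reliability_values, critical_values)
```

With the example values of FIG. 4, “NDC” has the highest reliability value but does not exceed its critical value of 4200, so the second-ranked “MDC” (3800 against 3600) is set as the target word and “NDC” and “ADC” are placed in the similar word list.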

FIG. 9 is a flowchart illustrating a method for adjusting a critical value of a similar word in a display apparatus according to an example embodiment.

As illustrated in FIG. 9, the display apparatus 100 determines whether at least one similar word among a plurality of similar words included in a similar word list is set as an execution command (S910). In response to determining that the similar word included in the similar word list is set as the execution command, the display apparatus 100 decreases a critical value of the similar word set as the execution command by a predetermined adjustment value and increases a critical value of the similar word set as a target word by the predetermined adjustment value (S920).

The example embodiment is not limited thereto. That is, in response to a similar word set as an initial target word being set as an execution command, the display apparatus 100 may decrease a critical value of the similar word set as the target word by a predetermined adjustment value.

As described above, the display apparatus 100 according to an example embodiment may set a target word with respect to a user's uttered voice by using the adjusted critical value of the similar words in a voice recognition process, thereby enhancing the voice recognition rate on the user's uttered voice.

As above, the example embodiments of the disclosure have been described.

The foregoing example embodiments and advantages are merely examples and are not to be construed as limiting. The disclosure can be readily applied to other types of devices. Also, the description of the example embodiments is intended to be illustrative, and not to limit the scope of the claims, and many alternatives, modifications, and variations will be apparent to those skilled in the art.

What is claimed is:
 1. A method for performing voice recognition in a display apparatus, comprising: receiving an uttered voice of a user; extracting a plurality of similar words which are similar to the uttered voice by extracting voice information from the uttered voice and determining reliability of a plurality of words based on the extracted voice information; setting a word satisfying a predetermined condition from among the plurality of extracted similar words as a target word with respect to the uttered voice; and displaying at least one of the target word and a similar word list including similar words other than the target word.
 2. The method as claimed in claim 1, wherein the voice information is pronunciation information on text converted through voice recognition on the uttered voice.
 3. The method as claimed in claim 2, wherein extracting the plurality of similar words comprises extracting a plurality of similar words which are similar to the uttered voice based on a reliability value determined based on a similarity between a pronunciation defined for each of the plurality of words and a pronunciation with respect to the uttered voice, wherein setting the word satisfying the predetermined condition comprises comparing a reliability value for each of the plurality of similar words and a critical value for each of the similar words and setting a similar word having a reliability value higher than the critical value for each of the similar words as a target word with respect to the uttered voice.
 4. The method as claimed in claim 1, further comprising: setting an execution command, wherein in response to an execution command not being received for a predetermined critical time or in response to a selection command with respect to the target word being received, setting the execution command comprises setting the target word as an execution command, and in response to a selection command with respect to the similar word list being received, setting a similar word corresponding to the selection command as an execution command.
 5. The method as claimed in claim 4, wherein the similar word list is a list including similar words other than the target word and different symbolic letters being matched to correspond with the other similar words, wherein in response to the selection command being an uttered voice related to a symbolic letter, setting the execution command comprises setting a similar word which is matched to correspond with a symbolic letter similar to the uttered voice from among the similar words in the similar word list as the execution command.
 6. The method as claimed in claim 4, further comprising: adjusting a critical value assigned to the similar word set as the execution command from among the plurality of similar words including the target word.
 7. The method as claimed in claim 6, wherein in response to the target word being set as an execution command, adjusting the critical value comprises decreasing a critical value of the similar word set as the target word by a predetermined adjustment value.
 8. The method as claimed in claim 6, wherein in response to a similar word included in the similar word list being set as an execution word, adjusting the critical value comprises decreasing a critical value of the similar word set as the execution command by a first adjustment value and increasing a critical value of the similar word set as the target word by a second adjustment value.
 9. The method as claimed in claim 1, wherein in response to the plurality of similar words which are similar to the uttered voice being extracted, extracting the plurality of similar words comprises grouping the plurality of extracted similar words into a similar word group.
 10. The method as claimed in claim 9, wherein in response to the similar words extracted in connection with the uttered voice being grouped into a similar word group, extracting the plurality of similar words comprises extracting all words in the similar word group as a similar word related to the uttered voice.
 11. A display apparatus comprising: input circuitry configured to receive an uttered voice of a user; a display configured to display a voice recognition result of the uttered voice; a voice processor configured to extract a plurality of similar words which are similar to the uttered voice by extracting voice information from the uttered voice and determining reliability of a plurality of words based on the extracted voice information; and a controller configured to set a word satisfying a predetermined condition from among the plurality of extracted similar words as a target word with respect to the uttered voice and to control the display to display at least one of the target word and a similar word list including similar words other than the target word.
 12. The apparatus as claimed in claim 11, wherein the voice information is pronunciation information on text converted through voice recognition on the uttered voice.
 13. The apparatus as claimed in claim 12, wherein the voice processor is configured to extract a plurality of similar words which are similar to the uttered voice based on a reliability value determined based on a similarity between a pronunciation defined for each of the plurality of words and a pronunciation with respect to the uttered voice, wherein the controller is configured to compare a reliability value for each of the plurality of similar words and a critical value assigned to each of the similar words and to set a similar word having a reliability value higher than the critical value assigned to each of the similar words as a target word with respect to the uttered voice.
 14. The apparatus as claimed in claim 11, wherein the controller is configured to set the target word as an execution command in response to an execution command not being received for a predetermined critical time or in response to a selection command with respect to the target word being received, and to set a similar word corresponding to the selection command as an execution command in response to a selection command with respect to the similar word list being received.
 15. The apparatus as claimed in claim 14, wherein the similar word list is a list including similar words other than the target word and different symbolic letters being matched with the other similar words, wherein the controller is configured to set a similar word which is matched with a symbolic letter similar to the uttered voice from among the similar words in the similar word list as an execution command in response to the selection command being an uttered voice related to a symbolic letter.
 16. The apparatus as claimed in claim 14, wherein the controller is configured to adjust a critical value assigned to the similar word set as the execution command from among the plurality of similar words including the target word.
 17. The apparatus as claimed in claim 16, wherein the controller is configured to decrease a critical value of the similar word set as the target word by a predetermined adjustment value in response to the target word being set as an execution command.
 18. The apparatus as claimed in claim 16, wherein the controller is configured to decrease a critical value of the similar word set as the execution command by a first adjustment value and to increase a critical value of the similar word set as the target word by a second adjustment value in response to a similar word included in the similar word list being set as an execution word.
 19. The apparatus as claimed in claim 11, wherein the controller is configured to group the plurality of extracted similar words into a similar word group in response to the plurality of similar words which are similar to the uttered voice being extracted.
 20. The apparatus as claimed in claim 11, wherein the voice processor is configured to extract all words in the similar word group as a similar word related to the uttered voice in response to the similar words extracted in connection with the uttered voice being grouped into a similar word group.